Minimax Gaussian Classification & Clustering

Authors

  • Tianyang Li
  • Xinyang Yi
  • Constantine Caramanis
  • Pradeep Ravikumar
Abstract

We present minimax bounds for classification and clustering error in the setting where covariates are drawn from a mixture of two isotropic Gaussian distributions. Here, we define clustering error in a discriminative fashion, demonstrating fundamental connections between classification (supervised) and clustering (unsupervised). For both classification and clustering, our lower bounds show that without enough samples, the best any classifier or clustering rule can do is close to random guessing. For classification, as part of our upper bound analysis, we show that Fisher’s linear discriminant achieves a fast minimax rate Θ(1/n) with enough samples n. For clustering, as part of our upper bound analysis, we show that a clustering rule constructed using principal component analysis achieves the minimax rate with enough samples. We also provide lower and upper bounds for the high-dimensional sparse setting where the dimensionality of the covariates p is potentially larger than the number of samples n, but where the difference between the Gaussian means is sparse.
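As a rough illustration of the two rules the abstract names, the sketch below simulates a balanced mixture of two isotropic Gaussians and compares a plug-in Fisher linear discriminant (supervised) with a sign-of-projection rule on the top principal direction (unsupervised). It is a minimal sketch and not the authors' construction: the symmetric means +mu and -mu, the known identity covariance, the sample sizes, and all function names are assumptions made for the example.

```python
import numpy as np

def sample_mixture(n, mu, rng):
    """Draw n points from 0.5*N(mu, I) + 0.5*N(-mu, I); labels are +1 / -1.
    The symmetric, balanced setup is an assumption of this sketch."""
    y = rng.choice([-1, 1], size=n)
    x = y[:, None] * mu[None, :] + rng.standard_normal((n, mu.size))
    return x, y

def fisher_lda_fit(x, y):
    """Plug-in Fisher discriminant assuming identity covariance:
    the direction is the difference of class means, the threshold their midpoint."""
    m_pos = x[y == 1].mean(axis=0)
    m_neg = x[y == -1].mean(axis=0)
    w = m_pos - m_neg
    b = -w @ (m_pos + m_neg) / 2.0
    return w, b

def lda_predict(x, w, b):
    return np.where(x @ w + b >= 0, 1, -1)

def pca_cluster(x):
    """Assign clusters by the sign of the projection onto the top principal direction."""
    centered = x - x.mean(axis=0)
    # The first right-singular vector is the leading eigenvector of the sample covariance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return np.where(centered @ vt[0] >= 0, 1, -1)

def clustering_error(pred, y):
    """Misclustering rate, minimized over the two possible label assignments."""
    err = np.mean(pred != y)
    return min(err, 1.0 - err)

rng = np.random.default_rng(0)
mu = np.zeros(10)
mu[0] = 2.0  # hypothetical mean separation: ||mu|| = 2

x_train, y_train = sample_mixture(2000, mu, rng)
x_test, y_test = sample_mixture(2000, mu, rng)

w, b = fisher_lda_fit(x_train, y_train)
lda_error = np.mean(lda_predict(x_test, w, b) != y_test)
pca_error = clustering_error(pca_cluster(x_test), y_test)

print(f"LDA test error: {lda_error:.3f}")
print(f"PCA clustering error: {pca_error:.3f}")
```

With this much separation and this many samples, both error rates should land close to the Bayes error for the simulated model. Shrinking the sample size or the norm of `mu` pushes both toward 0.5, the random-guessing regime the lower bounds describe.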

Similar resources

Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation

While several papers have investigated computationally and statistically efficient methods for learning Gaussian mixtures, precise minimax bounds for their statistical performance as well as fundamental limits in high-dimensional settings are not well-understood. In this paper, we provide precise information theoretic bounds on the clustering accuracy and sample complexity of learning a mixture...

UDC 519.244.3 An Asymptotic Minimax Theorem for Gaussian Two-Armed Bandit

The asymptotic minimax theorem for the Bernoulli two-armed bandit problem states that the minimax risk has the order N^{1/2} as N → ∞, where N is the control horizon, and provides estimates of the factor. For the Gaussian two-armed bandit with unit variances of one-step incomes and close expectations, we improve the asymptotic minimax theorem as follows: the minimax risk is approximately equal to 0.637N^{1/2} as N ...

Réduction de dimension en statistique et application en imagerie hyper-spectrale

This thesis deals with high dimensional statistical analysis. We focus on three different problems motivated by medical applications: curve classification, pixel classification and clustering in hyperspectral images. Our approaches are deeply linked with statistical testing procedures (multiple testing, minimax testing, robust testing, and functional testing) and learning theory. Both are intr...

FINITENESS PROPERTIES OF LOCALE COHOMOLOGY MODULES FOR (I;J)- MINIMAX MODULES

ABSTRACT. Let R be a commutative noetherian ring, and let I and J be two ideals of R. In this paper we introduce the concept of an (I;J)-minimax R-module, and it is shown that if M is an (I;J)-minimax R-module and t a non-negative integer such that H^i_{I;J}(M) is (I;J)-minimax for all i

Robust Method for E-Maximization and Hierarchical Clustering of Image Classification

We developed a new semi-supervised EM-like algorithm that is given the set of objects present in each training image, but does not know which regions correspond to which objects. We have tested the algorithm on a dataset of 860 hand-labeled color images using only color and texture features, and the results show that our EM variant is able to break the symmetry in the initial solution. We compared...

Publication date: 2017